[57] ABSTRACT

A fault tolerant communication arrangement, for switching parallel N-bit information among a plurality of stations, includes an M-bit crossbar switch, where M is greater than N by a number S of supernumerary or spare bit paths. At each station, an interface unit monitors for errors, and when an error is traced to a bit in the transmission path, routes the defective bit to one of the spare bit paths. All stations reroute data from the defective bit path to the same spare bit path. Error coding information is generated at the transmitting interface unit, and transmitted over some of the supernumerary bit paths, and when the number of defective bit paths reduces the number of available supernumerary bit paths to zero, the bit intensity of the error coding is reduced, to free additional supernumerary paths. In a system in which some of the stations include memory, a failure of a memory bit at a particular address is, in effect, a failure of that bit in an overall transmission path. A memory sparing map keeps track of defective locations, and routes bits to other, non-defective memory locations.

7 Claims, 9 Drawing Sheets 

FAULT TOLERANT SWITCHED COMMUNICATION SYSTEM

FIELD OF THE INVENTION

This invention relates to fault tolerant arrangements for switched communications among a plurality of stations by means of parallel digital signals. More particularly, the invention relates to reliability improvement by the provision and control of redundant bit paths.

BACKGROUND OF THE INVENTION

It is often necessary to provide for information (data) communication amongst any pair of a plurality of transmit-receive stations. FIG. 1 illustrates, in simplified block diagram form, a communication system 10 including a plurality of transmit-receive (transducing) stations (ST) 12_1, 12_2, 12_3 . . . 12_K, and another plurality of stations 14_1, 14_2, 14_3 . . . 14_L, interconnected by (N+C)-bit data paths designated 13_1, 13_2 . . . 13_K, 15_1, 15_2 . . . 15_L and by a crossbar switch 16. Crossbar switch 16 includes a set of K+L data ports, including a plurality K of ports, each of which is coupled by a data path 13_1, 13_2, . . . , 13_K to one of stations 12_1, 12_2, . . . 12_K, respectively, and also including a further plurality L of ports, each one of which is coupled by a data path 15_1, 15_2, . . . , 15_L to one of stations 14_1, 14_2, . . . 14_L, respectively. As illustrated in FIG. 1, each data path 13, 15 includes a number N+C of parallel data paths, N of which carry digital signal bits of significance ranging from a least-significant bit (LSB) to a most-significant bit (MSB). One of the N bit paths associated with each station carries the LSB, another of the N paths carries the MSB, and each of the other N bit paths carries bits of a particular significance lying between the LSB and MSB. The C bit paths are used for error coding, such as for error detection and correction (EDAC) or parity coding. For example, station 12_1 communicates through data path 13_1, by means of an (N+C)-bit digital signal, including an LSB and an MSB, each of which is carried by a separate bit path (ordinarily one conductor wire) of the N portion of data path 13_1, and carries error coding bits in the C portion of data path 13_1. Similarly, station 14_1 communicates by means of an N-bit digital signal and C error coding bits through data path 15_1, which has (N+C) bit paths. It should be noted that some or many prior art communications systems may dispense with error coding, whereupon C=0, and communication system 10 of FIG. 1 becomes an N-bit system.

Crossbar switch 16 as illustrated in FIG. 1 includes a plurality 1, 2, 3 . . . K "upper" ports connected to stations 12_1, 12_2, 12_3 . . . 12_K, and includes a further plurality of "lower" ports 1, 2, 3 . . . L, which are connected by way of data paths 15 to stations 14_1, 14_2, 14_3 . . . 14_L. The separate designations should not be construed to mean that there is any difference among the ports. Thus, there is no necessary difference among any of the stations 12 and any of the stations 14, and they could all have easily been designated by a single reference numeral, such as 12, with a different set of subscripts. Similarly, there is no difference among any of the information ports of crossbar switch 16.

The arrangement of FIG. 1 is vulnerable to the failure of a single component or device. For example, in the arrangement of FIG. 1, if crossbar switch 16 became inoperative, as might occur in the absence of error coding if even one bit path of the N bit paths became open or shorted to ground in the switch or its interconnecting paths, or in the presence of error coding if a number of bit paths of the N bit paths, exceeding the number for which the error coding corrects, became inoperative, the entire communication system might become nonfunctional.

FIG. 2 illustrates an arrangement generally similar to FIG. 1, in which two crossbar switches, designated 16a and 16b, are paralleled. In FIG. 2, elements corresponding to those of FIG. 1 are designated by like reference numerals. In FIG. 2, each output port of crossbar switch 16a is paralleled with the corresponding output port of the redundant crossbar switch 16b. For example, data port 1 of the lower set of L ports of crossbar switch 16a is connected in parallel with data port 1 of the lower set of L ports of crossbar switch 16b, as exemplified by data path 22_1^a which connects data port 1 of the upper set of ports of switch 16a to data path 15_1, and a similar data path 22_1^b connecting data port 1 of the lower set of ports of switch 16b to data path 15_1. In this context, the term "parallel" means that each bit path of data path 22_1^a is connected to the corresponding bit path of data path 22_1^b. The arrangement illustrated in FIG. 2 provides redundancy of the crossbar switch, so that a failure of the switch or of a portion thereof may be overcome by use of the alternate or redundant crossbar switch.

The arrangement of FIG. 2 does, however, have some limitations in the level of achievable redundancy. The arrangement of FIG. 2 provides for redundancy of data paths such as data path 22_1^a and 22_1^b in that an open-circuit failure in one of the data paths can be overcome by switching to the redundant crossbar switch [...]. However, a short-circuit or inadvertent interconnection of one bit path of a data path to another bit path of the same data path, or to ground, cannot be corrected, as a result of the parallel connections of data path 22_1^a to 22_1^b. Also, the arrangement of FIG. 2 by implication requires some means for detecting the existence of a failure associated with a crossbar switch. In the simplest situation, this might involve a human operator who observes the system and who, in response to an overt system problem such as a broken or failed bit or data path (wire or fiberoptic cable), or in response to inappropriate system behavior, controls the system so as to operate with the alternate crossbar switch. Faster and more reliable operation might be achieved with an automatic error detection system, for detecting the presence of errors by comparison of parity bits or the like. If such an error detection system were associated with one of switches 16a or 16b of FIG. 2, a single failure in the error detection system itself might result in an inability to switch in the presence of a failure in the data paths. The provision of an additional crossbar switch may not be the most cost-effective way to provide fault tolerance in such a system, and may also adversely affect system performance due to additional signal loading attributable to the parallel connections.

An improved multiple station communication system for parallel digital signals is desired.

SUMMARY OF THE INVENTION

A fault tolerant system for communicating among plural stations, each of which transduces (transmits and/or receives) N-bit parallel digital information signals, includes a switching system capable of switching
M parallel bits, where M>N. Fault tolerance is provided by associating each N-bit station with a controllable or controlled interface unit or multiplexing scheme which interconnects the N-bit station with the M-bit switching system. The controlled interface unit associated with each station routes the N bits to be transmitted by a station over N operable bit paths of the M-bit switching system, bypassing any nontransmissive bit paths by use of one or more supernumerary (S), spare or extra M-N bit transmission paths of the switching system. In a receiving mode of the controlled interface unit associated with each station, those bit signals received over the supernumerary bit paths are routed by the interface unit back to the bit paths of appropriate significance of the receiving station. In a particular embodiment of the invention, the switching arrangement is a crossbar switch. In an embodiment of the invention, EDAC error detection and correction is used under normal operating conditions, with the additional code bits required for the EDAC routed over a plurality of the spare, additional or supernumerary bit paths; when all of the supernumerary bit paths are in use for carrying bit data, due to defects in the main data paths, the occurrence of an additional defect switches operation from a more bit-intensive error coding to a less bit-intensive coding, as from EDAC to parity, thereby freeing additional ones of the supernumerary bit paths for carrying defective bits. The occurrence of further defects, over the number of additional defects in the main data-carrying paths which fully use the additional ones of the supernumerary bit paths freed by switching from EDAC to parity, is handled by further reducing the intensity of the coding or deleting error coding altogether, as by deleting parity coding, and using the supernumerary bit paths freed thereby to carry the further defective bit paths. The occurrence of further defects in the bit paths can be detected, after the EDAC and parity coding is eliminated, by the use of test transmissions.

In another embodiment, the system includes a memory made up of plural memory bank sections or pages, or interleaved memory structures, in which different pages may have defective memory locations at various different addresses, and in which the controlled interface units each include further memory, which is programmed to reset the state of the interface "column spare" multiplexers in response to the memory pages being addressed.

DESCRIPTION OF THE DRAWINGS

FIG. 1 is a simplified block diagram of an (N+C)-bit prior-art communication system including a plurality of transmit-receive stations interconnected by way of a crossbar switch;

FIG. 2 is similar to FIG. 1, and includes a redundant crossbar switch paralleled with the first;

FIG. 3 is a simplified block diagram of a communication system according to the invention, including a plurality of N-bit stations and an M-bit crossbar switch, where M is greater than N, and including a controlled interface unit 102_X or 104_X associated with each station;

FIG. 4a is a simplified block diagram of a controlled interface unit according to the invention, which may be used in a communication system such as that of FIG. 3; the controlled interface unit includes two different types of multiplexers, FIGS. 4b and 4c are simplified block diagrams of multiplexers which may be used in the arrangement of FIG. 4a, and FIGS. 4d and 4e are simplified block diagrams of error encoding and decoding portions of the arrangement of FIG. 4a, respectively;

FIGS. 5a, 5b and 5c together constitute a logic flow chart illustrating control of the arrangement of FIG. 3 and FIGS. 4a, 4b, 4c, 4d, and 4e in one method according to the invention;

FIG. 6 is a simplified block diagram of a system according to an aspect of the invention, which is similar to FIG. 3, but in which some of the stations and their interface units have been replaced by individual memory banks, which are subject to column defects;

DESCRIPTION OF THE INVENTION

FIG. 3 is a simplified block diagram of a communications system 100 in accordance with an aspect of the invention. In FIG. 3, elements corresponding to those of FIG. 1 are designated by like reference numerals. In FIG. 3, the crossbar switch is designated 116 rather than 16 as in FIG. 1, because it differs from crossbar switch 16 of FIG. 1 by having additional supernumerary (S) bit paths over the (N+C) bit paths of switch 16. As illustrated in FIG. 3, crossbar switch 116 is capable of M-bit operation, where M>N. It should be understood that the M bits may include C additional overhead bits, such as error coding or parity bits. The number of spare or supernumerary bits might be considered to be S=M-(N+C), where the hyphen represents subtraction, or more simply M=N+C+S, but for reasons described below, the supernumerary bit paths may be considered to include the coding bit paths and true supernumerary paths, so that S=M-N, or S=C+S_T. As a particular simple example, M might equal seventy-three, of which the actual data bits N might be sixty-four, the EDAC error coding bits might be eight, and the true supernumerary bits would in that case be one, meaning that the width M of the signal (as opposed to overhead or supernumerary bits) path is one bit greater than the minimum required to carry a 64-bit signal with eight-bit error coding, so that S_T=1. It should be understood that, in general, the reliability of the system will improve as the number of true supernumerary bits S_T increases. A desirable number of true supernumerary bits for use with data bits N equal to sixty-four and error coding bits C equal to eight might be S_T equal to twelve, whereby the total number of bits M would be eighty-four, of which twenty would be supernumerary bits.

In FIG. 3, an N:M controlled interface unit 102_X is associated with each station 12, where M=(N+C+S_T), and a similar controlled interface unit 104 is associated with each station 14. For example, a controlled interface unit 102_1 is coupled by way of an (N+C+S_T)-bit data path 113_1 to a port of crossbar switch 116, and is also coupled by way of an N-bit data path 13_1 with station 12_1. Similarly, a controlled interface unit 102_K is connected with a port of crossbar switch 116 by way of an (N+C+S_T)-bit data path 113_K, and to a station 12_K by way of an N-bit path 13_K. Controlled interface units 104_X are similarly connected by (N+C+S_T)-bit data paths 15_X to ports of switch 116, and by way of N-bit paths 115_X to corresponding stations 14_X, where subscript X represents any one of the L stations 14. As described below, one or more of stations 12_X or 14_X may include RAM or ROM memory, which may represent one or more pages of global memory.

In operation of the arrangement of FIG. 3, a transmitting station 12 (or 14) produces N-bit signals to be transmitted to another station via the communication system, and applies those N bits to its corresponding controlled interface unit 102 or 104. For example, station 14_1 may produce N data bits on data path 115_1 for ultimate transmission to station 12_K, and applies those N bits over data path 115_1 to interface unit 104_1. Interface unit 104_1 is controlled, in conjunction with other similar interface units, by a control unit, processor or computer illustrated as a block 8. Control block 8, as further described below, is connected by a bus 7 to each of the controlled interface units 102_X, 104_X, and monitors the status of the transmission paths extending through the crossbar switch, including transmission paths such as path 15_1 and 113_K, to determine the existence of failures to transmit among the bit paths of the data path. Control block 8 operates by collecting error status from the error detection and correction (EDAC) or parity coding portions of the various interface units 102 and 104. If all N+C+S_T bits of the transmission paths, including the path through switch 116, are operable, the N-bit signal applied from a station 12 or 14 to the associated controlled interface unit 102_X or 104_X, respectively, is then applied over the N-bit portion of the (N+C+S_T) system bit path, whereby the S_T supernumerary bit paths of the transmission path are unused.

In the event that one of the N data bit paths within the data path extending from interface unit 104_1 of FIG. 3 to interface unit 102_K is determined by control block 8 to be not transmitting or to be nonfunctional, such as might occur in the event of an open-circuit condition of that particular bit path, or due to a malfunctioning (solid-state) switch contact, control unit 8 identifies the defective bit path to interface units 104_1 and 102_K. As described below, interface unit 104_1 is then reconfigured, generally speaking, to couple the N data bits from the associated station 14 onto those of the N+C+S_T data paths extending through data path 15_1, switch 116 and data path 113_K as are operational. One way to accomplish this is to couple those data bits which would otherwise be transmitted over the defective bit path or paths (which we may call "defective" data bits) onto the true supernumerary or extra M-(N+C) data paths in data path 15_1, switch 116 and data path 113_K. For example, if the number of true supernumerary bits M-(N+C) is twelve as suggested in the above example, the first of the twelve true supernumerary bit paths is selected to carry the defective bit. At the same time that the defective data bits are coupled onto the supernumerary data paths in controllable interface unit 104_1, all other interface units, including unit 102_K, are reconfigured in response to the information from control block 8 to accept the N bits from the bit paths extending through the communications path, corresponding to the bit paths selected by interface unit 104_1. More particularly, the communication system carries the N-1 nondefective bits over the N regular bit paths, one of which is defective, and carries the defective bit over one of the S_T bit paths. Thus, failed bit path(s) extending through the communications system is (are) bypassed by extra bit path(s) associated with the communications system, so that no single failure of a bit path through the transmission system can result in a failure to communicate. More generally, a number of bit path failures equal to S_T can be accommodated without any degradation whatever to the system performance.
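
The rerouting described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the circuitry of the invention: the function names `build_sparing_map`, `transmit`, `receive` and `corrupt` are hypothetical, physical bit paths are numbered 0 to M-1 with the true supernumerary paths last, error coding is omitted (C=0) for brevity, and a defective path is assumed to read as 0. Every interface unit is assumed to be given the same list of defective paths by control block 8, so that all units derive the same map.

```python
def build_sparing_map(n, s_t, defective):
    """Map each of the n logical data bits to a physical bit path.

    Paths 0..n-1 are the regular paths and paths n..n+s_t-1 are the
    true supernumerary (spare) paths. Each bit whose regular path is
    defective is moved to the next working spare path.
    """
    defective = set(defective)
    spares = [p for p in range(n, n + s_t) if p not in defective]
    mapping = []
    for bit in range(n):
        if bit in defective:
            if not spares:
                raise RuntimeError("no spare bit path left")
            mapping.append(spares.pop(0))
        else:
            mapping.append(bit)
    return mapping


def transmit(bits, mapping, m):
    """Place the logical bits on an m-wide physical word (unused paths carry 0)."""
    word = [0] * m
    for bit, path in zip(bits, mapping):
        word[path] = bit
    return word


def corrupt(word, defective):
    """Model open-circuit paths: a defective path always reads 0 (assumption)."""
    return [0 if i in defective else b for i, b in enumerate(word)]


def receive(word, mapping):
    """Pick each logical bit back off the physical path it was sent on."""
    return [word[path] for path in mapping]


# Example with N=8 data bits, S_T=2 spares, and regular path 6 defective.
N, S_T = 8, 2
defective = {6}
mapping = build_sparing_map(N, S_T, defective)  # bit 6 is moved to spare path 8
data = [1, 0, 1, 1, 0, 0, 1, 1]
received = receive(corrupt(transmit(data, mapping, N + S_T), defective), mapping)
```

Because the transmitting and receiving sides use the same map, the data word survives the defective path unchanged; once more than S_T paths have failed, the sketch raises an error rather than modelling any reduction of the error coding.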

Crossbar switch 116 of FIG. 3 and stations 12_X and 14_X are conventional. FIG. 4a is a simplified block diagram of a controlled interface unit of FIG. 3. For definiteness, FIG. 4a represents controlled interface unit 104_1 of FIG. 3. In FIG. 4a, an M-bit interface port 101 having M=(N+C+S_T) bits, at the top of the FIGURE, connects to data path 15_1 and from there to a port of switch 116 of FIG. 3. At the bottom of FIG. 4a, an N-bit interface port 105 connects to data path 115_1 and thence to station 14_1 of FIG. 3. Data flows from interface port 101 to interface port 105, within interface unit 104, by way of a data path designated 108, a multiplexer (MPX) block or unit 106, a further data path including a transmission path 110, an error decoding and data correction block 112, and a data path 114. Data flows from interface port 105 to interface port 101 by way of a data path including transmission path 114, its continuation transmission path 128, an error coding block 126, an (N+C)-bit transmission path 124, a multiplexer block 120, and a transmission path 122 with M=N+C+S_T bits. Multiplexer blocks 106 and 120 are described in more detail in conjunction with FIGS. 4b and 4c, and error decoding and encoding blocks 112 and 126, respectively, are described in more detail in conjunction with FIGS. 4d and 4e.

FIG. 4b is a simplified block diagram of multiplexer 106 of FIG. 4a. In FIG. 4b, elements corresponding to those of FIG. 4a are designated by like reference numerals. In FIG. 4b, multiplexer 106 includes a plurality, equal to (N+C), of (S_T+1)-input, single-bit-output multiplex switches or units 206_1, 206_2, 206_3, . . . 206_(N+1), 206_(N+2), . . . 206_(N+C). This type of multiplex unit is ordinarily known as a "one-of-N" multiplexer, but this terminology might cause confusion, so they are termed "one-of-many" multiplex units herein. There are (N+C) multiplex units 206 within multiplexer 106 of FIG. 4b, one for each data and error coding bit. In general, each multiplex unit 206 has a number of single-bit input ports equal to (C+S_T+1), that is, equal to one more than the sum of the number of error coding bit paths and the number of true supernumerary bit paths, generally as illustrated in conjunction with multiplex unit 206_1 in FIG. 4b. In the abovementioned example, N=sixty-four, C=eight, and S_T=twelve, so C+S_T=20, and (C+S_T+1)=21. Thus, each one-of-many multiplex unit 206 in the example would be a one-of-twenty-one (21:1) multiplex unit, and there would be N+C=64+8=72 such multiplex units in FIG. 4b. Each bit multiplex unit also includes an output port, and further includes a control port coupled to a command or control bus 148 for control of the state of each multiplex unit 206 independently of the state of any other multiplex unit. Each bit multiplex unit 206 of FIG. 4b (including multiplex unit 206_1) therefore includes, in the simplest case of one supernumerary bit and no error coding, at least two input ports, one for the data bit of a particular significance and the other for the spare bit.
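
The sizing arithmetic in the example above can be checked with a short calculation. This is a minimal sketch; the function name `interface_sizing` and the dictionary keys are hypothetical labels chosen for illustration, not terms from the description.

```python
def interface_sizing(n, c, s_t):
    """Return bit-path and multiplex-unit counts for given N, C and S_T."""
    m = n + c + s_t              # total bit paths through the crossbar switch
    s = c + s_t                  # coding paths plus true supernumerary paths
    return {
        "M": m,
        "S": s,
        "rx_mux_units": n + c,   # units 206: one per data or error coding bit
        "rx_mux_inputs": s + 1,  # one L input plus C+S_T R inputs per unit
    }


# The example from the text: N=64 data bits, C=8 EDAC bits, S_T=12 true spares.
example = interface_sizing(n=64, c=8, s_t=12)
print(example)
```

For these values the sketch gives M=84, S=20, 72 multiplex units and 21 inputs per unit, which agrees with the figures quoted in the text.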

While any bit path of the N-bit data portion of data path 108 coupled to a multiplexer 106 may be assigned to carry bits of any significance, the simplest arrangement is to apply the least significant bit (LSB) of the data signal arriving at column spare multiplexer 106 from data path 108 by way of one-bit data path 205_1 to the left (L) input port of multiplex unit 206_1 of FIG. 4b, thereby leaving C+S_T ports of multiplex unit 206_1 available for the error coding (C) bits and supernumerary (S) bits. The error coding bits and any other signal arriving by way of the C+S_T portion of data path 108 are applied to the C+S_T right (R) input ports of multiplex unit 206_1. The second-least-significant-bit of the N-bit data signal is applied to the L input port of multiplex unit 206_2, thereby leaving C+S_T ports of multiplex unit 206_2 available for the error coding and supernumerary bits. The third-least-significant-bit of the N-bit data signal is applied to the L input port of multiplex unit 206_3. Similarly, the most significant bit (MSB) arriving on data path 108 is applied by way of one-bit data path 205_N to the L input port of multiplex unit 206_N. Bits of other significance are applied by way of other one-bit data paths 205 to the L input ports of other multiplex units 206 lying between multiplex units 206_3 and 206_N. More specifically, the least-significant-bit of the C error coding bits is applied to the L input port of multiplex unit 206_(N+1), and the most-significant-bit of the C error coding bits is applied to the L input port of multiplex unit 206_(N+C). Error coding bits of other significance are individually applied to the L input ports of other multiplex units lying between 206_(N+1) and 206_(N+C). The C error coding bit paths, and the extra, spare or supernumerary (S) data path(s), are applied to the right (R) input ports of all multiplex units 206. Thus, if there were C=8 error coding bit paths and S_T=12 supernumerary bit paths, each multiplexer 206 would have twenty R ports, and a particular one of the error coding or supernumerary bit paths would be connected to the same one of the R ports of each of multiplex units 206_1 through 206_(N+C).
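
The wiring just described can be modelled as follows. This is a minimal sketch under stated assumptions: the names `receive_multiplexer` and `select` are hypothetical, the word arriving on data path 108 is assumed to be ordered as N data paths, then C coding paths, then S_T true spare paths, and `select[k]` is `None` when unit k passes its L input and otherwise an index into the shared R bus of C+S_T paths.

```python
def receive_multiplexer(word, n, c, s_t, select):
    """Model multiplexer 106: N+C one-of-many units sharing one R bus.

    word   -- list of N+C+S_T bits arriving on data path 108
    select -- per-unit control state: None selects the L input,
              an integer r selects R input r (0 <= r < C+S_T)
    """
    assert len(word) == n + c + s_t
    assert len(select) == n + c
    r_bus = word[n:]                  # coding and spare paths, common to all units
    out = []
    for k in range(n + c):
        if select[k] is None:
            out.append(word[k])       # L input: the unit's own bit path
        else:
            out.append(r_bus[select[k]])
    return out


# Example: N=4, C=0, S_T=1. Data path 2 is defective and reads 0,
# while the single spare path carries the real value of bit 2.
word = [1, 0, 0, 1, 1]
select = [None, None, 0, None]
recovered = receive_multiplexer(word, 4, 0, 1, select)
print(recovered)
```

Because every unit sees the same R bus, any data or coding bit can be recovered from any spare path simply by changing that unit's control state.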

In operation of multiplexer 106 of FIG. 4b, multiplex unit 206_1 is normally (in the absence of detection of a system failure) controlled to a state in which the signal applied to its L input port from one-bit LSB input signal path 205_1 is coupled by way of its one-bit output port to one-bit LSB output signal path 208_1, and any signal applied to its other or S input ports (a single other input port in the S=1 example, and twenty other input ports in the S=20 example) is blocked, and cannot pass. Multiplex unit 206_(N+C) is normally controlled to a state in which the signal applied to its L input port from one-bit error code MSB input signal path 205_(N+C) is coupled by way of its one-bit output port to one-bit MSB output signal path 208_(N+C). The other multiplex units 206_2, 206_3 . . . 206_(N+C-1) are similarly controlled to couple the bits of other significance applied to their L inputs by way of their output ports to paths 208 of other significance. Thus, under normal conditions, the N-bit data input signal and associated C-bit error codes (a total of N+C bits) received over data path 108 (N+C+S_T bits wide) are individually coupled, by way of the L ports of the (N+C) multiplex units 206_1-206_(N+C), where the hyphen represents the word "through", to data path 110, and the true supernumerary bit paths are not used.

Control is accomplished with the aid of known error detection schemes, as described below. Thus, the N multiplex units 206_1-206_N of FIG. 4b couple the N data bits arriving at their L inputs onto the N portion of the N+C output data path 110, and the C multiplex units 206_(N+1)-206_(N+C) couple the C error coding bits arriving at their L input ports to the C portion of output data path 110, for application to data correction block 112 of FIG. 4a, all under the control of instructions received from a command or control interface block 134 of FIG. 4a over control bus 148. Control interface block 134 of controlled interface unit 104_1 is coupled, together with all corresponding control interface blocks in other controlled interface units 102_X, 104_X, to control block 8 of FIG. 3, for correlated or overall control, as described below.

FIG. 4c is a simplified block diagram of multiplexer 120 of FIG. 4a. Elements of FIG. 4c corresponding to those of FIG. 4a are designated by like reference numerals. In FIG. 4c, (N+C)-bit input data path 124 is coupled directly to M-bit output data path 122 by an interconnecting N-bit bypass data path 221. This arrangement allows the N-bit data signal portion of an N-bit data signal with C-bit error coding arriving on data path 124 to be coupled directly to an N-bit portion of M-bit data path 122 under normal (no defective bit paths) conditions. There are C+S_T one-of-many multiplex units 220 in multiplexer 120 of FIG. 4c. In particular, FIG. 4c illustrates one-of-many multiplex units 220_1, 220_2 . . . 220_(C+S_T). Since there are C+S_T multiplex units 220 in multiplexer 120 of FIG. 4c, there is one multiplex unit 220 for each bit of the C-bit error coding signals arriving at multiplexer 120 over data path 124 from error coding block 126 of FIG. 4a, and S_T additional multiplexers 220. Each one-of-many multiplex unit 220_1, 220_2 . . . 220_(C+S_T) has a single output bit path 222_1, 222_2, . . . 222_(C+S_T), respectively, which couples to one bit path of an S-bit supernumerary portion 222_S of data path 122, where S=C+S_T, and thence to one bit path of data path 108 of FIG. 4a. Each multiplex unit 220_1, 220_2 . . . 220_(C+S_T) of FIG. 4c also has an input data path 224 including N+C bit paths, which is coupled to source N+C data path 124, and in parallel with the corresponding bit paths of the input data paths of all other one-of-many multiplex units 220_1, 220_2 . . . 220_(C+S_T), and of which, as mentioned above, N bits are also coupled to the corresponding bits of N-bit bypass data path 221. Each one-of-many multiplex unit 220_1, 220_2 . . . 220_(C+S_T) has a blocking state, in which all inputs are inhibited or blocked from proceeding to its output bit path 222_1, 222_2, . . . 222_(C+S_T), respectively, and also has an unblocked or transmissive state, in which it controllably selects, from among all of its N+C input bit paths, one of the bit paths for application to its single output bit path 222_1, 222_2, . . . 222_(C+S_T).

Under normal conditions, in which the bit paths in data paths 122 and 108 of FIG. 4a are unbroken, and the corresponding bit paths 15_1, and through crossbar switch 116 of FIG. 3, are unbroken, the LSB, MSB and bits of intermediate significance applied over data path 124 of FIG. 4a to multiplexing unit 120 may be coupled by data path 221 of FIG. 4c to corresponding bit paths of output data path 122. Also under normal conditions, C of the one-of-many multiplex units 220_1, 220_2 . . . 220_(C+S_T) are transmissive, coupling the C error coding bits from input data path 124 to C of the S=C+S_T supernumerary data paths in M-bit output data path 122. The remaining S_T true supernumerary bits of the output data path 122 are not used, and the corresponding ones of multiplex units 220 of FIG. 4c are in a blocking state.

In the event that a particular system bit path is determined to be nontransmissive, control signals on control bus 148 of FIGS. 4a, 4b and 4c are readjusted by control interface block 134 of FIG. 4a to cause a corresponding one-of-many multiplex unit, such as multiplex unit 220_(C+S_T) of FIG. 4c, to route the signal bit onto one of the S spare bit paths of M-bit data path 122. The defective data bit is preferably routed onto one of the S_T bit paths, if available, so that the C error coding bits continue to be transmitted. Suppose, for example, that the second-most-significant-bit (SMSB) path of M-bit data path 122 were nontransmissive, but all other bit paths were transmissive; the SMSB bit would then not be arriving at the destination at the remote end of data path 122. When this condition is detected, as described below, one of the one-of-many multiplex units 220 of FIG. 4c, as for example multiplex unit 220_1, would be placed
in its transmissive state, selecting the SMSB input to couple to its spare output bit path 222_1, designated spare bit path 1 or <S1> in FIG. 4c. Since spare bit path <S1> is presumptively functional, the full N-bit signal arriving over data path 124 would appear at the remote end of data path 122, with the LSB, MSB and all bits of other significance occupying their normal positions in the N-bit portion of M-bit data path 122, but with the SMSB appearing on bit path <S1> of the S-bit portion of M-bit data path 122. Naturally, if only one supernumerary bit path is available, no further failures can be accommodated. If, on the other hand, there are a plurality S of supernumerary bit paths in the S-bit portion of M-bit data path 122, and if, in addition, there is a like number of multiplex units 220 connected as depicted in FIG. 4c, then as many as S transmission failures can be simultaneously accommodated.

In the abovementioned example, N equals sixty-four data bits, C equals eight EDAC error coding bits, and S equals eight error coding bits plus twelve true supernumerary bits S_T. Assume that twelve of the sixty-four data bits have become defective, and are currently being routed through the true supernumerary bit paths. It might seem that no further errors could be accommodated by the system. According to another aspect of the invention, however, additional bits are made available for carrying defective data bits, by changing from one type of error coding to another type which requires fewer bits. For example, the EDAC error coding could be changed to parity coding, which requires fewer bits. Of course, changing from EDAC to parity coding elim-

couples either the C_1 EDAC bits from its L port or the C_2 parity-plus-zeroes bits from its R port to C-bit output path 242, under the control of commands applied from control interface block 134 of FIG. 4a over control bus 148. Thus, C-bit path 242 of FIG. 4d carries C_1 EDAC data bits, eight in number in the example, so long as the number of defective data paths in the communication system does not exceed the number S_T of true supernumerary bits, and when the number of defective data bits exceeds S_T, path 242 carries C_2 parity bits, together with additional "empty" bits, which in the example is two parity bits and six zeroes. The C-bit signal on data path 242 joins the N-bit data on bypass path 229 to form an (N+C)-bit signal on output data path 124 for application to multiplexer 120 of FIG. 4a.

FIG. 4e is a simplified block diagram of error decoding block 112 of FIG. 4a. In FIG. 4e, N-bit data, together with C_1- or C_2-bit error coding data, depending upon the number of defective data bits and the resulting operating mode of encoder 126 of FIG. 4d, is applied over (N+C)-bit data path 110 to error decoder 112 of FIG. 4e. In normal operation with no defective data bits, the N-bit data is applied to an EDAC decoder 252, an error correction block 254, a parity decoder block 258, and to the L port of a multiplexer 256. However, parity decoder block 258 is disabled in the normal operating mode, or its output is not used if it is enabled. Instead, EDAC decoder 252 operates on the received data and error codes, and generates error information, which is applied to data correction block 254 to enable block 254 to correct the data. EDAC decoding block
inates the EDAC capabilities, replacing them with the 252 also produces fault status information such as error 
capabilities of parity coding. Thus, the EDAC capabili- presence and location information, which is applied 
ties of error correction and direct identification of the over a path 253 to control bus 148, for transmission to 
defective bit location are given up, and replaced by 35 control interface block 134 of FIG. 4a. As mentioned, 
simple identification of the existence of an error in the the uncorrected data is applied from data correction 
transmission. If EDAC coding requires eight bits, as in block 254 by an N-bit path 255 to the R input port of a 
the above example, and parity coding of two 32-bit multiplexer 256. The uncorrected data from an N-bit 
blocks of the 64 data bits requires two bits, six bits can bypass bus 251 is applied to the L input port of multi- 
be freed for use in carrying additional defective data 40 plexer 256. In normal operation, the corrected data 
bits, thereby raising the total number of defective data from block 254 is preferred, so multiplexer 256 is corn- 
bits which can be accommodated from twelve to eigh- manded to couple its R input port to its N-bit output 
teen. As described below, test transmissions are com- data path 114. The operation of block 112 of FIG. 4e is 
manded in response to error identifications by the parity the same as that described above, so long as the number 
coding, to thereby determine which data bits are defec- 45 of defective data bits does not exceed the number Srof 
tive. true supernumerary data paths, because the defective 

FIG. 4d is a simplified block diagram of EDAC/- bits are rerouted by multiplexer 106 of FIG. 4a before 
parity encoding block 126 of FIG. 4c. In FIG. 4d, N-bit they get to error decoder and data correction block 112. 
data applied over data path 128 is applied by an N-bit In the event that the number of defective data paths 
bypass data path 229 to the N-bit portion of (N+Q-bit 50 exceeds the number Srof true supernumerary bit paths, 
output data path 124. Since the error coding circuits error encoder 126 of FIG. 4a is commanded to encode 
must know what the data bits are in order to perform parity rather than EDAC, and, in a similar fashion, 
the coding function, the N-bit data is also applied to the commands are applied over command bus 134 to 
inputs of an EDAC coding block 230 and a parity cod- EDAC decode block 252 and parity decode block 258 
ing block 232 of a coding arrangement 23L EDAC 55 of FIG. 4e, to disable EDAC decoding, and enable 
coding block 230 and parity coding block 232 are cou- parity decoding. In the example, the sixty-four bit data 
pled to branches of control bus 148,, to receive enable signal may have parity applied to two thirty-two bit 
and disable commands from control interface block 134 blocks, such as the LSB and MSB blocks, which pro- 
of FIG. 4a. When enabled, EDAC coding block 230 duces the two parity bits of the example. The parity 
produces its error coding, with Ci bits, which in the 60 decoder cannot produce enough information to allow 
example is eight bits, on Ci-bit data path 234 for applica- error correction, but simply identifies the presence or 
tion to the Ci-bit left (L) port of a multiplexer 240. absence of an error in the data block. Since the parity 
Similarly, when enabled, parity encoder 232 produces decoding does not correct the data, multiplexer 256 is 
its error coding, with C2 bits, which in the example is commanded to switch, and couple the data from its L 
two bits, on C2-0U data path 236. Data path 236 is joined 65 input port to output path 114. The data applied to the L 
by additional "0" or logic low bits, sufficient in number input port, as mentioned above, is the uncorrected N-bit 
to make the total number of bits applied to the right (R) data from bypass path 251. This allows uncorrected 
port of multiplexer 240 equal to d. Multiplexer 240 data to flow through the controlled interface unit to the 
utilizing station, but the presence of an error is signalled by
parity decoder 258, so that other measures can be taken, such as
retransmission of the message, to continue operation. Allowing
uncorrected data to flow through the system in this manner is
considered preferable to complete cessation of operation, as would
occur if there were no additional functional data paths beyond the
S_T true supernumerary bit paths.

FIGS. 5a, 5b and 5c together constitute a flow chart illustrating one
scheme for controlling the communication system of FIGS. 3 and 4a,
4b, 4c, 4d and 4e in accordance with an aspect of the invention. In
FIG. 5a, the logic flow starts at a START block 300, and flows to a
decision block 302, which reads a "previous configuration" flag. If a
previous configuration does not exist, the logic flows to a block
312, which represents initialization of all registers in the system
to a nominal condition, such as by setting all error logging
registers to zero. Block 312 also represents the enabling of EDAC
error coding and decoding, and disabling of parity coding. If a
previous configuration exists, the logic leaves decision block 302 by
the YES output, and flows to a block 304, which represents reading of
a nonvolatile store containing information about the previous
configuration. From block 304, the logic flows to a block 306, which
represents reconfiguring the EDAC or parity conditions, and the spare
configurations of the station interface units. Whether or not a
previous configuration exists, the logic flows from either block 306
or 312 to a block 314, which represents the initiation of test
transmissions if data transmissions are not taking place. Block 316
represents the reading or polling of all error logging registers.
When the registers have been read, the logic proceeds to a decision
block 318, in which the presence or absence of an error redirects the
logic flow. In the absence of an error, the logic leaves decision
block 318 by the NO output, and flows back to block 314 by way of a
logic path 320. When an error is identified, the logic leaves
decision block 318 by the YES output, and arrives at a block 322,
representing reading of the error logging registers associated with
the error, and determining the error bit location, as by evaluation
of the Hamming Error Syndrome associated with the EDAC coding. From
block 322, the logic flows to a block 324, which represents the
determination of the number of true supernumerary bit paths which are
in use. A nonvolatile store, which may be located in control unit 8,
maintains a log of the failure locations. When the system is
initially turned on, there may have been a large number of defects
which have arisen as a result of years of operation in an adverse
environment. To avoid having the system re-identify all the errors,
and perform all the reconfigurations, the control unit re-establishes
the prior configuration. In some applications, it may be possible
that the failures will "heal" themselves, in which case, the system
can start from a "virgin", non-reconfigured condition, and perform
the reconfiguration as the errors are detected, either through normal
data transfers or through special testing. Assuming that the
supernumerary bit paths are assigned in sequence to correction of
errors, knowledge of the number of supernumerary bit paths in use
also identifies the next one to be used. If an ordinary numerical
sequence is not used for some reason, the available bit paths must
also be determined. From block 324, the logic flows to a decision
block 326, which compares the number of true supernumerary bit paths
in use with the available number of true supernumerary bit paths. If
the number in use is less than the number available, a supernumerary
bit path is available, and the logic leaves decision block 326 by the
YES output, and proceeds to a logic block 328. Logic block 328
represents the assignment of the next supernumerary bit path in
sequence to the current defective bit, and commanding all the
multiplexers in all the controlled interface units to switch
accordingly. From block 328, the logic flows back to block 314 by a
logic path 330.

If all the true supernumerary bit paths are in use, the logic leaves
decision block 326 of FIG. 5a by the NO output, and proceeds to a
block 332. Block 332 represents disabling of the EDAC encoding and
decoding, and the enabling of parity encoding and decoding, in order
to make a number of additional supernumerary bit paths available. In
the numerical example, the EDAC used eight bits, and the parity
encoding only two bits, thereby freeing six additional supernumerary
bit paths for use in carrying defective bits. From block 332, the
logic flows to a further block 334, which assigns the next available
one of the additional supernumerary bits to the current defective
bit, and instructs the multiplexers in all the controllable interface
units to switch configuration to route the current defective bit
through the newly freed supernumerary bit path.

When the EDAC error coding has been disabled and the parity coding
enabled as a result of the presence of a number of defective bits
exceeding the number of true supernumerary bit paths, operation of
the system using EDAC coding, as contemplated in the flow chart
portion of FIG. 5a, is no longer possible. From block 334 of FIG. 5a,
the logic flows to a block 336, corresponding to A block 336 of the
flow chart portion of FIG. 5b.

If the communication system control logic is started in the parity
error coding mode of operation, the logic begins at a block 350 of
FIG. 5b, and flows to a block 352, which represents the
initialization of all the fault logging registers, enabling of parity
encoding and decoding, and disabling the EDAC coding and decoding.
From block 352, the logic flows to block 354, which is also the
starting point for logic transfer from the flow chart portion of FIG.
5a. In the parity encoding mode, operating data is transferred among
stations, or in the absence of operating data, test data is
transmitted, according to block 354. Logic block 356 represents
reading the error logging registers. From block 356, the logic
arrives at a decision block 358. Decision block 358 reroutes the
logic according to the presence or absence of an error in the last
data transmission. If no error is identified, the logic flows back to
block 354 by way of logic path 360. If a "hard" error (a permanent
error) is identified, the logic flows to a block 362, which
represents reading the error logging registers associated with the
error, to determine, for example, which of the thirty-two bit LSB or
MSB blocks of data contained the error. As illustrated, the control
system assumes that a single error constitutes a hard error, but if
single event upsets (SEUs) are expected, control interface 134 may
count the errors, and form a fault-to-good transmission ratio,
whereupon a "hard" fault is represented by a ratio which exceeds a
threshold value. Block 364 represents the enabling of test
transmission generator 160 of FIG. 4a, to cause it to send test
transmissions in the appropriate LSB or MSB portion of the data path,
through error encoder 126, multiplexer 120, and out interface port
101 to crossbar switch 16 of FIG. 3. At each receiving interface
unit, a test transmission checker 161 receives the test
transmissions, evaluates them and supplies the
result to interface block 134. The crossbar switch can be stepped to
couple the test signal to all possible controlled interface units, to
test all paths from the source of the test signals, or, if the source
of the data signals in which the error occurred is known, the
crossbar switch is set to couple the test signal through that same
path. The logic flows from block 364 to decision block 365, which
determines if an error has been identified. There is the possibility
that the test will show no errors, in which case any error is assumed
to have been an SEU. If no error was identified, the logic leaves
decision block 365 by the NO output, and returns to block 354. If an
error was identified, the location of the defective data bits is
determined. The logic then flows to a block 366, which represents the
determination of the number of the currently available supernumerary
bits. These are the bits which were previously freed for use by
switching from EDAC to parity error coding. By the time block 366 is
reached for the first time, one of those supernumerary paths has
already been used, as described in conjunction with block 334 of FIG.
5a. Decision block 368 of FIG. 5b evaluates the number of
supernumerary bits remaining, and reroutes the logic flow by way of
the YES output to block 370 if paths remain available. Block 370
represents assignment of the next one of the available supernumerary
bit paths to the currently identified defective bit. This means that
the multiplexers of all the controlled interface units 102 and 104 of
FIG. 3 are reconfigured to route the defective bit over the selected
supernumerary path. From block 370, the logic flows back to block 354
by way of logic path 372. The currently identified defective bit
having been rerouted, operation continues, with the logic traversing
the loop including blocks 354, 356, the NO output of decision block
358, and logic path 360 back to block 354, until the next hard error
occurs. Each time a hard error occurs, a supernumerary path is
assigned by the flow of FIG. 5b, until no more supernumerary bit
paths are available, and a further hard error occurs. When the next
hard error occurs following assignment of the last of the
supernumerary bit paths by block 370 of FIG. 5b, the logic will be
rerouted by the NO output of decision block 368 to a block 374, which
represents the disabling of both the EDAC and the parity error
coding. The logic then flows to a block 376, in which one of the
newly freed supernumerary bits is assigned to the defective data bit.
From block 376, the logic flows to a transfer block 378, which
represents a transfer to corresponding block B of FIG. 5c.

If the control logic is started in a no-error-correction mode at
START block 400 of FIG. 5c, the error logging registers are
initialized, and both EDAC and parity error coding are disabled in a
block 410. From block 410, the logic flows to a block 412, which is
the starting point of the logic transferred from the logic of FIG. 5b
by way of B transfer block 378. In FIG. 5c, logic block 412
represents commands which allow normal data transmissions. Block 414
represents commands which break the normal data transfer, or which,
during normal breaks in the data transfer, represent commanding of
test transmissions by test transmission generator 160 of FIG. 4a, and
reception of the expected data patterns by corresponding test
transmission receivers. From block 414, the logic flows to a decision
block 416, which evaluates the results of the test transmissions and
receptions. If no errors are identified, the logic leaves decision
block 416 by the NO output, and proceeds back to block 412 by logic
path 418. In the event that a hard error is identified, the logic
leaves decision block 416 by the YES path, and arrives at a block
420. Block 420 represents the running of diagnostic tests between
stations reporting errors, in order to identify defective bits. Block
422 represents the reading of error logging registers which record
the errors arising from the diagnostic tests of block 420, and block
424 represents the determination of the number of supernumerary bits
available. In the above described example in which the data was
sixty-four bits, true supernumerary bits was twelve, EDAC was eight
bits, and parity coding occupied two bits, only one supernumerary bit
can remain at this point in the flow chart, because parity was
disabled in block 374 of FIG. 5b, freeing two bits for supernumerary
use, and one of those bits was immediately assigned to the then
current defective bit, leaving only one remaining supernumerary bit.
However, for generality, the evaluation can be performed. If a bit is
available, the logic leaves decision block 426 by the YES output, and
proceeds to a block 428, which represents the assignment of the next
one of the remaining supernumerary bit paths to the currently
defective bit. The logic then returns to block 412 by logic path 430.
The loop including blocks 412, 414, 416, 420, 422, 424, 426 and 428
continues to assign currently available supernumerary bits to
currently identified defective data bits until the number of
supernumerary data paths is exhausted, whereupon the logic is
rerouted by decision block 426 to leave by the NO output. From the NO
output of decision block 426, the logic flows to a block 434, which
represents the assignment of a non-remappable error or fault, and the
logic then flows to an END block 436.

In operation of the system as so far described, a fault in a data
line may be detected either by the EDAC/parity features, or by test
programs. The system control computer (8 of FIG. 3) interrogates
control interface block 134 of controlled interface units 104 at the
receiving end of the data transmission to determine if an error in a
received signal has occurred. Each control interface block 134
includes status and error logging registers which, for the control
interface block at controlled interface units at the data receiving
end of communication transmissions, indicate the presence of a parity
or EDAC error, and its bit location. The system control computer
reads the status and error logging registers, and responds to the
existence of an error at a given controlled interface unit 102, 104,
by interrogating its control interface block 134 to determine the
error location, namely the particular bit of the N bits which is in
error. Once the bit position is known, the control computer instructs
all the control interface blocks 134 at the controlled interface
units 102, 104 to command (a) allocation of a spare bit path in the
associated multiplex unit 120 to the defective bit, and (b) instructs
the control interface block 134 of all other controlled interface
units 102, 104 to reconfigure their multiplex units 106 in a
corresponding manner. For example, if only one bit is in error, the
reconfiguration of multiplex unit 120 of FIG. 4c might select the
available spare bit paths in sequence, by placing one-of-many
multiplexer 220_1 in a non-blocking mode, to pass the bit of the
particular significance in which the error was detected. Thus, if the
error was in the second most significant bit of the N-bit data, the
second most significant bit of multiplexer 220_1 would be enabled.
Thus, one of the controlled interface units will be at a
"transmitting" location, and at that location the second-MSB of the
data is transmitted over the S_1 spare bit path. At all of the other
interface units 102 or 104
of FIG. 3, the instructions from the control computer, by way of the
control interface block 134 in each controlled interface unit 102 or
104, also command the enabling of multiplexer 206 of FIG. 4b, which
responds by blocking the second MSB of the N-bit data path, and
passing the data from the S_1 spare bit path to its output. Thus, the
data which would normally traverse the defective second-MSB path in
any transmission is rerouted over the intact spare bit path. By
commanding all controlled interface units to "spare" the same bit,
communication is guaranteed for all transmit-receive pairs, although
some pairs may not require the spare path.
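
The sparing operation described above can be illustrated with a short Python sketch. It is only a behavioural model under stated assumptions, not the circuitry of FIGS. 4b and 4c: the names `transmit`, `channel`, `receive` and `spare_map` are hypothetical, the word is shortened to eight data bits for readability, and a defective bit path is modelled as being stuck at logic low.

```python
N = 8  # data bits per word (the text's example uses 64; 8 keeps the demo short)
S = 2  # spare (supernumerary) bit paths, so the path is M = N + S bits wide


def transmit(data_bits, spare_map):
    """Transmit-side multiplexing: copy each spared data bit onto its spare path.

    spare_map maps a defective data-bit position to a spare index (0 .. S-1).
    """
    path = list(data_bits) + [0] * S
    for bad_bit, spare in spare_map.items():
        path[N + spare] = data_bits[bad_bit]
    return path


def channel(path, stuck_low):
    """Model the switch: any bit path listed in stuck_low always reads 0."""
    return [0 if i in stuck_low else bit for i, bit in enumerate(path)]


def receive(path, spare_map):
    """Receive-side multiplexing: take spared bits from the spare paths instead."""
    data = list(path[:N])
    for bad_bit, spare in spare_map.items():
        data[bad_bit] = path[N + spare]
    return data


word = [1, 1, 0, 1, 0, 0, 1, 1]   # index 0 = MSB, index 1 = second MSB
spare_map = {1: 0}                # every unit spares the second MSB onto spare 0

received = receive(channel(transmit(word, spare_map), stuck_low={1}), spare_map)
unspared = receive(channel(transmit(word, {}), stuck_low={1}), {})
```

Because the transmitting and receiving sides are given the same `spare_map`, `received` equals `word`, whereas `unspared` has lost the second MSB. This mirrors the requirement that all interface units spare the same bit.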

When a second path error is detected in the system, the control
computer again detects the presence of the error and its bit
location, and commands all the controlled interface units to allocate
the second spare bit to the new defective bit position. This does not
change the previous spare bit allocation, so that both spare bit
paths are used for all transmit-receive pairs, even though not all of
them require the spare. Clearly, the number of defective bit paths
which can be accommodated in this manner equals the number of spare
bit paths available. As so far described, the invention
advantageously allows the use of controlled interface units external
to the crossbar switch to improve the reliability of the
communication system, without the addition of further crossbar
switches. However, it does require a "wider" crossbar switch, with
more bit paths than the minimum required to carry the basic data
stream.
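
The sequential allocation of spare bit paths, together with the fall-back from EDAC to parity to no error coding traced in the flow charts of FIGS. 5a, 5b and 5c, can be sketched as a small bookkeeping class. This is a simplified illustration using the figures of the example (twelve true spares, eight EDAC bits, two parity bits). The class name `SpareAllocator`, the mode labels and the use of a simple running counter as the spare index are assumptions made for the sketch, not details taken from the flow charts.

```python
EDAC_BITS = 8      # bits used by EDAC coding in the example
PARITY_BITS = 2    # bits used by parity coding in the example
TRUE_SPARES = 12   # true supernumerary bit paths in the example


class SpareAllocator:
    """Hypothetical bookkeeping for assigning spare bit paths in sequence."""

    def __init__(self):
        self.mode = "EDAC"
        self.assignments = {}  # defective data bit -> spare path index

    def capacity(self):
        """Spare paths usable in the current coding mode."""
        freed = {
            "EDAC": 0,
            "PARITY": EDAC_BITS - PARITY_BITS,  # six paths freed by dropping EDAC
            "NONE": EDAC_BITS,                  # two more freed by dropping parity
        }[self.mode]
        return TRUE_SPARES + freed

    def assign(self, defective_bit):
        """Assign the next spare in sequence, reducing the coding if necessary."""
        if len(self.assignments) >= self.capacity():
            if self.mode == "EDAC":
                self.mode = "PARITY"
            elif self.mode == "PARITY":
                self.mode = "NONE"
            else:
                raise RuntimeError("non-remappable fault: no spare bit paths remain")
        spare = len(self.assignments)
        self.assignments[defective_bit] = spare
        return spare


allocator = SpareAllocator()
modes = []
for bit in range(20):
    allocator.assign(bit)
    modes.append(allocator.mode)
```

In this model the first twelve defects are handled in EDAC mode, the thirteenth forces the change to parity coding, the nineteenth forces error coding to be dropped altogether, and a twenty-first defect would be reported as non-remappable.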

In FIG. 4a, a control EDAC block 140 provides "handshake" information
between the interface unit and other interface units, or to the
crossbar switch. In general, this requires at least two bit lines for
control signal, one outgoing and one incoming. Since these two bit
paths would constitute potential single-point failure possibilities,
a plurality, such as ten (five in, five out) bit paths are provided,
and majority three-of-five voting is performed in block 140 to
provide fault tolerance to control-line failures.
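
Three-of-five majority voting of the kind mentioned above can be modelled in a few lines. The function name `majority_vote` is hypothetical, and the sketch only shows the voting rule itself, not how block 140 is implemented.

```python
def majority_vote(lines):
    """Return the logic value carried by the majority of an odd number of lines."""
    if len(lines) % 2 == 0:
        raise ValueError("majority voting needs an odd number of lines")
    return 1 if sum(lines) > len(lines) // 2 else 0


# Five redundant copies of one control signal; two of the five lines have failed.
sent_high_two_stuck_low = [1, 0, 1, 0, 1]
sent_low_two_stuck_high = [0, 1, 0, 1, 0]

voted_high = majority_vote(sent_high_two_stuck_low)
voted_low = majority_vote(sent_low_two_stuck_high)
```

With five lines, the voted result remains correct as long as no more than two of them are faulty.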

A memory 130, termed a "memory spare map", is coupled by command bus
148 to control interface 134, EDAC/parity encoder block 126, and
multiplexers 106 and 120, for purposes described below. A test
transmission generator 160 is coupled by data path 128 to EDAC/parity
encoder block 126, for, on command, generating a sequence of test
transmissions, such as "walking ones", for applying test signals to
the individual bit paths of that portion of the communication system
receiving the transmissions.
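
A "walking ones" test sequence of the kind mentioned above can be sketched as follows. The helper names (`walking_ones`, `find_failed_paths`, `stuck_low_path`) are hypothetical, and the path under test is modelled as a plain function from the transmitted pattern to the received pattern. A walking-ones pattern by itself exposes bit paths that cannot carry a logic high; detecting a path stuck at logic high would need a complementary pattern, which is not shown here.

```python
def walking_ones(width):
    """Yield `width` test patterns, each with a single 1 in a different bit position."""
    for position in range(width):
        yield [1 if i == position else 0 for i in range(width)]


def find_failed_paths(width, send):
    """Return the bit positions whose 1 did not arrive at the far end.

    `send` models the path under test: it takes the transmitted pattern and
    returns the pattern seen by the receiving checker.
    """
    failed = []
    for position, pattern in enumerate(walking_ones(width)):
        if send(pattern)[position] != 1:
            failed.append(position)
    return failed


def stuck_low_path(stuck):
    """Build a path model in which the listed bit positions always read 0."""
    def send(pattern):
        return [0 if i in stuck else bit for i, bit in enumerate(pattern)]
    return send


failed = find_failed_paths(8, stuck_low_path({3}))
healthy = find_failed_paths(8, stuck_low_path(set()))
```

Here `failed` identifies bit path 3 as defective, and `healthy` is empty because every walked bit arrives intact.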

FIG. 6 is a simplified block diagram of a communication system
according to an aspect of the invention. The arrangement of FIG. 6 is
similar to FIG. 3, but differs, in that some of the stations and
associated interface units, namely stations 12 and associated
interface units 102, are replaced by individual banks of random
access memory (RAM) 602_1, 602_2, 602_3, . . . 602_I. As mentioned
previously, there was no difference between stations 102 and 104 in
FIGS. 1 and 3, and similarly the replacing of all of stations 12 by
memory is arbitrary; less than all stations 12 could be replaced with
memories, or all of stations 12, and some of stations 14 could have
been replaced. The arrangement of FIG. 6 allows communication between
stations 14 and memories 602_1-602_I, corresponding to memory banks 1
through I, by way of M-bit data paths 615, crossbar switch 116 acting
as a memory bank selector, and M-bit data paths 613. Such a system
might find use, for example, in a communication system in which data
must be both manipulated and stored.

Each memory bank 602_1-602_I may be considered to include P
addresses, in which each address identifies storage locations for a
plurality of words. When a word is stored, it may be stored with
error coding bits, to allow error detection and correction, or with
parity coding, to at least identify the presence of an error.
According to an aspect of the invention, the memory is arranged with
additional bits at each of the P memory addresses, so that extra bits
can be stored, over that number of bits required to store the actual
data words. For example, if each data word to be stored is N bits
long, which in an example is sixty-four bits, and C (eight) bits of
EDAC error coding are desirably associated with each data word, the
number of bit storage locations for each word would be M, where
M = N + C + S_T, and S_T represents true supernumerary bit storage
locations. As described below, this arrangement, together with a
"column spare" memory in at least some of the controlled interface
units 104 (and in some of controlled interface units 102, if
appropriate), allows operation to continue despite column memory
defects in the memory banks.

In FIG. 6, memory bank 1, designated 602_1, has its word address
locations illustrated as being along the left side of the block, and
the bit locations associated with each word are laid out along the
top of the block, ranging from bit location 1 through bit location M.
The "columns" of bit memory locations are illustrated within each
memory block 602 by a vertical line. For example, the first bit
location column in memory 602_1 is represented as a solid vertical
line 604. Similarly, the third bit location column is represented as
a solid vertical line. Within each memory 602 of FIG. 6, the status
of the column of bit storage locations is represented by the type of
line; a solid line represents an operable column of bit storage
locations, and a dash line represents a defective column of bit
storage locations. Thus, column bit storage locations (columns) 1 and
3 of memory 602_1 of FIG. 6 are operable, but column 2 is defective,
and column 6 is also represented as being defective. Additional or
supernumerary column storage locations could be associated with each
memory bank, sufficient in number to compensate for the expected
maximum number of defective columns in that particular memory. Thus,
if a maximum of two defective columns were expected to be associated
with each memory, two additional columns of memory bit storage
locations could be added, so that the number of storage bits M of
each word would be M = N + C + S_T, where S_T equals two, and M in
the example would be seventy-four. The two S_T columns, being spares,
are then placed in service when other columns become defective.
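
The word-width arithmetic of this example can be checked with a few lines of Python. The values are those given in the text; the function name `column_roles` and the placement of the spare columns at the end of the word are illustrative assumptions.

```python
N = 64    # data bits per stored word
C = 8     # EDAC error coding bits per stored word
S_T = 2   # true supernumerary (spare) bit columns per memory bank

M = N + C + S_T  # bit storage locations per word


def column_roles(n=N, c=C, s_t=S_T):
    """Label each of the M columns as data, EDAC or spare (ordering is assumed)."""
    return ["data"] * n + ["edac"] * c + ["spare"] * s_t


roles = column_roles()
spare_columns = [i + 1 for i, role in enumerate(roles) if role == "spare"]  # 1-based
```

Under these assumptions the spare columns are columns 73 and 74, that is, columns M-1 and M.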

It is not to be expected that the same columns of bit storage
locations will become inoperative in each of the memories. Thus, in
memory bank 1, bits 2 and 6 are illustrated as being defective. In
memory bank 2 (602_2), bit columns 4 and 9 are defective, in memory
bank 3 (602_3), columns 2 and 7 are defective, and in general, as in
memory bank I (602_I), columns X and Y are defective. A scheme such
as that described in conjunction with FIGS. 3, 4a, 4b, 4c, 4d, 4e,
5a, 5b, and 5c could be used to spare the extra columns of the
memories. In such an arrangement, when columns 2 and 6 of memory bank
1 became defective, as determined by return of their stored data with
particular bits corrupted, columns 2 and 6 of memory bank 1 would be
replaced by spare
column M and its preceding column M-1 (not illustrated). The
corresponding columns 2 and 6 of all the other memories are also
replaced with their columns corresponding to the above-mentioned
columns M-1 and M. In the example of two spare columns, all the
spares are now used up, and recourse would be had to dropping the
error coding to free additional bit columns to handle any additional
column defects. If a larger number of spare bit columns than two is
available, more defective columns can be corrected, of course, but if
an average of two defective columns per memory bank is expected, for
example, and there are sixty-four memory banks (I=64), the provision
of spare columns would at least double the required size of each
memory bank.

According to a further aspect of the invention, memory 130,
associated with each of controlled interface units 104, is arranged
to map the defective columns in memory banks 1-I of FIG. 6. As
illustrated in FIG. 6, column spare memory 130 includes storage
locations for each bank of memory, designated 1, 2, 3, ... I. A
plurality of storage locations are associated with each bank storage
location of memory 130 of FIG. 6, for storing the locations of the
defective bit columns of the memory banks. For example, memory 130
has the digits 2 and 6 stored at memory locations associated with
memory bank 1, thereby indicating that those bit columns of memory
bank 1 are defective and must be spared. Similarly, the storage
locations associated with memory bank 2 in memory 130 contain the
digits 4 and 9, corresponding to the defective bit columns in memory
bank 2. Also in memory 130, bit columns 2 and 7 of memory bank 3 are
represented as being defective, and bit columns X and Y of memory
bank I. Unlike the arrangement described in conjunction with FIGS. 3,
4a, 4b, 4c, 4d, 4e, 5a, 5b and 5c, the arrangement of FIG. 6 adapts
the sparing to the particular memory bank being addressed, so that
the spare columns of each memory bank can be allocated to the defects
of only that particular memory bank, and no other. Thus, when station
14_1 accesses memory bank 1 by way of its controllable interface unit
104_1 for storage of data bits therein, and the defective bit column
locations in memory bank 1 have previously been determined and stored
in memory 130, column spare memory 130 is addressed at its "bank 1"
address, and the defective bit column information ("2" and "6" in the
example) is applied to control multiplexers 106 and 120. However,
only multiplexer 120 receives data bits at its input, and no signal
arrives at the

information is stored in memory bank 1, with spare bit columns M and
M-1 being used to store the bits which would otherwise be applied to
defective bit columns 2 and 6, and the stored information is put back
in the correct order when retrieved.

Control unit 8 transfers column sparing information which may have
been determined by a particular interface unit to the corresponding
portions of memories 130 of all the other controlled interface units
104. Thus, all controlled interface units 104 of FIG. 6 are loaded
with the same bit column sparing information. Thus, any other station
104 which communicates with memory bank 1 to write or read data
spares the same bits, so that the communication system operates
without corruption when writing to memory or reading from memory,
from different locations.

If station 14_1 of FIG. 6 wishes to write to, or read from, memory
bank 2, memory 130 of controlled interface unit 104_1 is addressed at
its "bank 2" address, and it responds with the digits 4 and 9,
corresponding to defective bit columns 4 and 9 of memory bank 2. The
defective bit column information is applied to multiplexers 106 and
120, which spare the corresponding bits of the data transmitted to
memory bank 2 for storage therein, or read therefrom. Similarly,
transmissions between station 14_1 and memory banks 3 and I result in
sparing bits 2, 7 and X, Y, respectively. Thus, two spare bit columns
per memory bank can provide sufficient additional memory to provide
uncorrupted data in the presence of two defective data bit columns in
each memory.

In general, whenever the error coding or column sparing configuration
of a memory bank is changed, the data stored therein must be flushed
and replaced under the new configuration or regime. In the event that
one of the memory banks 602_1-602_I is subject to an additional
defective bit column over the two for which spares were allocated,
additional spare bit columns can be made available by eliminating the
use of advanced, bit-intensive error coding, and substituting error
coding requiring fewer bits. If, for example, an additional defective
bit column occurs in memory bank 1, the EDAC coding stored therein
with each data word before the time the defect occurs will, when
read, give notice of the defective bit, and correct the data. When a
hard error has been determined or detected and no additional spare
bit paths are available, the memory can be flushed of its current
data and its EDAC coding, and reloaded
input of multiplexer 106. Consequently, multiplexer 106 with the same data, with parity coding instead of 
does nothing. Multiplexer 120 operates as described in 50 EDAC coding, as described above, with the additional 
conjunction with FIG. 4b, to reroute the defective bits defective data bit rerouted onto one of the bit columns 
applied to the input port over spare bit paths, so that bits freed for other use by dropping back from EDAC to 
2 and 6 of each data word, which would normally be parity coding. Column spare memory 130 would then 
applied to, and stored in, bit locations 2 and 6 of each require additional storage locations associated with 
word address of memory bank 1, are instead rerouted 55 each memory bank 602jf in which to store the sparing 
by spare bit paths S\ and S2 to column spare locations M information permitted by the additional freed paths. It 
and M— 1, respectively, of memory bank 1. When sta- would not even be necessary, in some cases, to have 
tion 14] addresses memory bank 1 by way of controlled true supernumerary or spare bit columns, so long as it 
interface unit 104] for reading data therefrom, memory were acceptable to switch to parity coding immediately 
130 is addressed, and again applies the defective bit 60 upon occurrence of a bit column defect 
column information to multiplexers 106 and 120, to According to one aspect of the invention, each word 
configure them for sparing bits 2 and 6. Consequently, transmitted by a station of FIG. 6 for storage in memory 
when memory bank 1 responds with the data from the is routed to a different one of memories 602. For exam- 
addressed location, including the correct bit 2 and bit 6 pie, if ten sequential words are transmitted by station 
information previously stored at spare bit column loca- 65 104 1 through crossbar switch 116 for storage, the first 
tions M and M — 1, multiplexer 106 remaps the spare bits word is directed to memory bank 1, the second word is 
to the bit 2 and bit 6 locations of the spared data, as directed to memory bank 2, the third word is directed to 
described in detail in conjunction with FIG. 4b. Thus, memory bank 3, and so forth, until the tenth word is 
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stored in the tenth memory bank (not illustrated). This 
in turn means that the crossbar switch switches after each
word, to redirect the next word, and it also means that
the sparing configuration changes after each word. A
word-cnt or "bank select" block 132 is illustrated in
FIG. 4, which is connected to memory 130. Block 132
keeps track of the particular memory bank associated
with each word being stored. More particularly, since
the number of memory blocks is known, block 132
counts the words transmitted modulo I, thereby identifying
the memory associated with each word, without the
necessity for keeping a log.
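
The per-bank column sparing and the modulo bank selection described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware of FIG. 6: the word width N, the use of three banks, zero-based column numbering, and every function name are hypothetical. The defective-column digits for banks 1 to 3 are taken from the example in the text, and defective columns are modelled as stuck at zero.

```python
# Hypothetical software model of per-bank column sparing (not the patented circuit).
N = 16        # data bits per word (assumed)
S = 2         # spare bit columns per bank, as in the two-spare example
M = N + S     # total bit columns per bank

# Stand-in for column spare memory 130: bank number -> defective bit columns.
# Column numbers are treated as zero-based indices in this sketch.
SPARE_MAP = {1: [2, 6], 2: [4, 9], 3: [2, 7]}
NUM_BANKS = len(SPARE_MAP)


def bank_for_word(word_index):
    """Stand-in for bank-select block 132: count words modulo the bank count."""
    return word_index % NUM_BANKS + 1


def spare_slot(k):
    """The k-th defect uses the last column, then the one before it (M, M-1)."""
    return M - 1 - k


def write_word(data_bits, bank):
    """Role of multiplexer 120: copy bits of defective columns onto spares."""
    columns = list(data_bits) + [0] * S
    for k, col in enumerate(SPARE_MAP[bank]):
        columns[spare_slot(k)] = data_bits[col]
    return columns


def apply_defects(columns, bank):
    """Model the bank's defective columns as stuck at zero (an assumption)."""
    damaged = list(columns)
    for col in SPARE_MAP[bank]:
        damaged[col] = 0
    return damaged


def read_word(columns, bank):
    """Role of multiplexer 106: restore spared bits to their data positions."""
    data_bits = list(columns[:N])
    for k, col in enumerate(SPARE_MAP[bank]):
        data_bits[col] = columns[spare_slot(k)]
    return data_bits


def round_trip(data_bits, word_index):
    """Store one word in the bank chosen by the counter and read it back."""
    bank = bank_for_word(word_index)
    stored = apply_defects(write_word(data_bits, bank), bank)
    return read_word(stored, bank)
```

Because the lookup is keyed by bank number, the sparing configuration changes with every word without any per-word log, which is the point made in the preceding paragraph.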

As so far described, the memory sparing map is used
to reconfigure the column sparing configuration of the
multiplexers in the context of memory blocks, to allow
continued operation in the presence of multiple errors at
each memory location. The "memory" sparing map can
be used in the same fashion to dynamically reconfigure
the sparing bit path allocation depending upon the stations
which are communicating, to allow continued
communication in the presence of more than S total
system failures of bit paths between stations fitted with
station interface units, so long as there are not more
than S failures in the bit paths between the communicating
stations.
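
A minimal sketch of this station-to-station use of the sparing map follows. The failure table, the station numbers and the function names are all hypothetical; the only rule taken from the text is that a pair of stations can keep communicating while no more than S bit paths between them have failed, even if the system-wide total exceeds S.

```python
# Hypothetical model of a sparing map keyed by communicating station pair.
S = 2  # spare bit paths per data port (assumed value)

# Assumed failure record: (lower station, higher station) -> failed bit paths.
FAILED_PATHS = {
    (1, 2): [3],
    (1, 3): [5, 11],
    (2, 3): [0, 7, 9],
}


def pair_key(a, b):
    """Order the pair so that (a, b) and (b, a) share one map entry."""
    return (a, b) if a <= b else (b, a)


def sparing_for(a, b):
    """Return {failed bit path: spare index}, or None when spares run out."""
    failed = FAILED_PATHS.get(pair_key(a, b), [])
    if len(failed) > S:
        return None
    return {bit: k for k, bit in enumerate(failed)}


def total_failures():
    """System-wide count of failed bit paths across all recorded pairs."""
    return sum(len(bits) for bits in FAILED_PATHS.values())
```

In this example the system has six failed bit paths in total, more than S, yet stations 1 and 2 and stations 1 and 3 can still be spared; only the pair 2 and 3, with three failures, cannot.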

The flow chart of FIGS. 5a, 5b and 5c is generally
applicable to the arrangement of FIG. 6, with the understanding
that the determination of the presence of
errors and the existence of spare bit paths is made on a
bank-by-bank (or station-to-station) basis.

Other embodiments of the invention will be apparent
to those skilled in the art. In particular, the number of
supernumerary bits may be selected as desired. The
inventive scheme as described may be used in conjunction
with a redundant crossbar switch arrangement
which is not collocated with the first multiplex switch
arrangement, for an ultra-reliable communications system.
While interface units 102 are illustrated as blocks
separate from station blocks 12 in FIG. 3, they may be
collocated or located in the same block. Switching
between EDAC and parity error coding schemes has
been described, but switching may be accomplished
among three or more coding schemes, as desired, which
might be, for example, 8-bit EDAC, 2-bit parity, and
1-bit parity. The inventive arrangement may be extended,
in certain cases where amplitude-representative
data is being communicated, and when additional defects
occur after all true supernumerary bits have been
allocated, and all the error coding has been dispensed
with and the freed-up bit paths allocated, by jettisoning
the LSBs of the data signal, and by routing the remaining
bits of greater significance through the available bit
paths. The reliability of the described communication
systems may be enhanced, if desired, by using redundant
control blocks 8 and interconnecting buses.
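
The graceful degradation described above, from bit-intensive coding to lighter coding, then to no coding, and finally to dropping LSBs, can be sketched as a simple selection rule. The values of N and S, the function name, and the rule of always choosing the richest scheme that still fits are assumptions made for illustration; the three coding schemes are the examples named in the text.

```python
# Hypothetical degradation ladder; widths and selection rule are assumptions.
N = 16       # data bits per word (assumed)
S = 10       # supernumerary bit paths, shared by coding bits and spares (assumed)
M = N + S    # total bit paths through the switch

# Coding schemes from the text's example, richest first, with their bit cost.
CODING_SCHEMES = [
    ("8-bit EDAC", 8),
    ("2-bit parity", 2),
    ("1-bit parity", 1),
    ("no coding", 0),
]


def plan(failed_paths):
    """Return (coding scheme, data bits carried) for a number of failed paths."""
    working = M - failed_paths
    for name, coding_bits in CODING_SCHEMES:
        if N + coding_bits <= working:
            return name, N  # full data word plus this scheme's coding bits
    # Not even the bare data word fits: keep the most significant bits only.
    return "no coding", max(working, 0)
```

With these assumed sizes, two failures still leave room for 8-bit EDAC, a third failure forces the drop to 2-bit parity, and beyond ten failures the least significant data bits start to be discarded, which the text suggests only for amplitude-representative data.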

What is claimed is: 

1. A system for communicating by the flow of data 
among a plurality of stations, comprising: 

a plurality of stations, each for transducing digital 
data in N-bit parallel bit form on an associated data
path, where N is a plurality, each bit of a particular 
significance transduced by any one of said stations 
being associated with a particular bit path of said 
associated data path; 

switch means, said switch means including a number,
equal to said plurality, of M-bit data ports, where 
M is a plurality, for switching multibit parallel 
paths for the flow of data between at least pairs of
said data ports of said switch means, any one of said 
switched multibit parallel paths being subject to 
failure in the event that one of said bits thereof 
becomes nontransmissive, each of said multibit 
parallel paths including said plurality M of bit 
paths, which plurality M exceeds said plurality N 
by a number S of supernumerary bit paths, which 
number S is at least one; 
a plurality of interface units, equal in number to the 
number of said plurality of stations, each of said 
interface units being associated with one of said 
stations and with one of said data ports of said 
switch means, each of said interface units comprising
(a) a plurality N of controllable single-bit
(S+1)-to-one multiplexing means, each of said
(S+1)-to-one multiplexing means including at least
a single-bit first input port, an S-bit second input 
port, and an output port, for coupling said first 
input port to said output port in a first control state, 
and for coupling one bit of said second input port 
to said output port in a second control state, said 
first input port of each of said (S+1)-to-one multiplexing
means being coupled to one of said N bit
paths of said associated one of said data ports of 
said switch means, said bit paths of said second 
input port of each of said (S+1)-to-one multiplexing
means being coupled to said supernumerary bit
paths of said associated one of said data ports of 
said switch means, and said output port of each of 
said (S+1)-to-one multiplexing means being coupled
to one of said N bit paths of said associated
one of said stations, for, in said first control state of 
said (S+1)-to-one multiplexing means of one of
said interface units, coupling information arriving 
on said one of said N bit paths of said associated 
one of said data ports of said switch means to a 
corresponding one of said N bit data paths of said 
associated one of said stations, and for, in said second
control state of said (S+1)-to-one multiplexing
means of said one of said interface units, coupling 
information arriving from said supernumerary bit 
paths of said M bit paths of said associated one of 
said data ports of said switch means to one bit path 
of said N-bit data paths of said associated one of 
said stations, and (b) an N-bit data path coupled for 
data flow from said N bit data paths originating at 
said associated one of said stations to N bits of the 
associated one of said data ports of said switch 
means, whereby, in the absence of a failure in any 
one of said N bits of said switched multibit parallel 
paths, N bits of data transmitted by said associated 
one of said stations is coupled over said N bits of 
said switched multibit parallel paths to a remote 
one of said stations, but in the presence of a break 
in one of said N bits of said switched multibit paral- 
lel paths, one of said N bits of data transmitted by 
said associated one of said stations may fail to reach 
said remote station, resulting in a communication 
failure, and (c) a plurality S of controllable one-of-many
multiplexing means, each of said one-of-many
multiplexing means including an N-bit input
port and a single-bit output port, each of said bits of 
said N-bit input port of each of said one-of-many 
multiplexing means being coupled to one of said N 
bit data paths originating at said associated one of 
said stations, and said output port of each of said 
one-of-many multiplexing means being coupled to 
one of said supernumerary bit paths of said associated
one of said data ports of said switch means,
whereby each of said N bits of data transmitted by 
said associated one of said stations may be coupled 
by said S one-of-many multiplexing means to any 
one of said S bits of said associated one of said data
ports of said switch means; and 

control means coupled to said plurality of interface 
units, for identifying a nontransmissive one of said 
N bit data paths extending through said switch 
means between any pair of said stations, for controlling
said (S+1)-to-one multiplexing means associated
with said nontransmissive bit path to said
second state, and for controlling said one-of-many 
multiplexing means to couple to one of said supernumerary
bit paths said one of said bits coupled to
its input port which is associated with said nontransmissive
bit path.

2. A system according to claim 1, wherein each of 
said interface units further comprises: 

error coding means including an N-bit input port and
a C-bit output port, said input port of said error 
coding means being coupled to receive said N-bit 
signal from said associated one of said stations, and 
said output port of said error coding means being 
coupled to C of said S input ports of each of said
plurality S of controllable one-of-many multiplex- 
ing means, for generating C-bit error coding signals 
in response to said N data bits, and for applying 
said C error coding bits to said controllable one-of-many
multiplexing means, whereby said error coding
signals may be coupled over C bit paths of said
S supernumerary bit paths extending through said 
switch means to a remote station; 

error signal decoding means including an (N+C)-bit
input port and an N-bit output port, for receiving
N-bit data signals on an N-bit portion of said 
(N+C)-bit input port from a remote station by way
of said plurality N of controllable single-bit (S+1)-to-one
multiplexing means, and for receiving a
plurality C of error coding signals related to said
N-bit data signals on a C-bit portion of said 
(N+C)-bit input port, for at least detecting the
presence of errors in said N-bit data signals, and for 
coupling N-bit data signals to said output port of 
said error signal decoding means;

a further plurality C of controllable single-bit (S+1)-to-one
multiplexing means, each of said further
plurality of controllable (S+1)-to-one multiplexing
means including at least a single-bit first input port, 
an S-bit second input port, and an output port, for
coupling said first input port to said output port in 
said first control state, and for coupling one bit of 
said second input port to said output port in said 
second control state, said first port of each of said 
further plurality C of (S+1)-to-one multiplexing
means being coupled to one of C of said supernu- 
merary bit paths of said associated one of said data 
ports of said switch means, said bit paths of said 
second input port of each of said C (S+1)-to-one
multiplexing means being coupled to said S supernumerary
bit paths of said associated one of said
data ports of said switch means, for, in said first 
control state of said further (S+1)-to-one multiplexing
means, coupling bits arriving on said one of
said C supernumerary bit paths of said associated
one of said data ports of said switch means to a 
corresponding one of said C-bit portions of said 
(N+C)-bit input port of said error signal decoding
means, and for, in said second control state of said
(S+1)-to-one multiplexing means of said one of
said interface units, coupling information arriving 
from said supernumerary bit paths of said M bit 
paths of said associated one of said data ports of 
said switch means to one bit path of said N-bit 
portion of said error signal decoding means, 
whereby said N-bit data signal is coupled to said 
associated one of said stations. 

3. A system according to claim 2, wherein said error 
signal decoding means includes error correction and 
detection means for using said error coding signal for 
correcting errors in said N-bit data signal applied 
thereto to generate a corrected N-bit data signal, and 
for applying said corrected N-bit data signal to said 
output port of said error signal decoding means. 

4. A system according to claim 3, wherein said error 
signal decoding means includes: 

parity checking means for detecting errors in said 
N-bit data signal applied thereto to generate error 
information; 

coupling means for coupling uncorrected N-bit data 
signal to said output port of said error signal decod- 
ing means; and 

control means for selectively enabling one of said 
parity checking means and said error detection and 
correction means. 

5. A method for communicating N-bit signals among 
N-bit stations, comprising the steps of: 

applying said N-bit signals from a transmitting station 
to an N-bit portion of an M-bit switching system 
including M bit paths extending through said 
switching system, where M is greater than N by a 
number S of supernumerary bit paths; 

in the event of a failure in one of said N bit paths of 
said switching system, routing that one bit of particular
significance, of said N bits transmitted by
one of said stations, which may, as a result of said 
failure, arrive at a receiving station as a defective 
bit, to one of said supernumerary paths extending 
through said switching system, to thereby form a 
rerouted bit; 

at a receiving station of said communication system, 
routing all of said N bits received from said N-bit 
portion of said M bit paths extending through said 
switching system, except said one bit of said partic- 
ular significance, to the corresponding bit positions 
of said N-bit receiving station; 

at said receiving station of said communication sys- 
tem, routing said rerouted bit from said supernu- 
merary path extending through said switching sys- 
tem to a bit position corresponding to said particu- 
lar significance; 

generating C error coding bits from said N-bit signal 
at said transmitting station; 

applying said error coding bits from said transmitting 
station to a C-bit portion of said M-bit switching 
system, where C is less than S; 

in the event of a failure in an additional one of said N 
bit paths of said switching system, to create an 
additional defective bit of another significance at a 
time when a number equal to S-C of said supernumerary
bit paths are in use carrying defective bits,
reducing the number of said error coding bits to 
less than C, thereby freeing at least one of said 
supernumerary bit paths, and routing said additional
defective bit to said one of said supernumerary
paths freed by said reduction in the number of
error coding bits, to thereby form an additional
rerouted bit; 

at a receiving station of said communication system, 
routing said additional rerouted bit from said one of
said supernumerary bit paths freed by reduction of
said error coding to a bit position corresponding to 
said other significance.

6. A method according to claim 5, further comprising
the steps of: 

in the event of a failure in a yet further one of said N
bit paths of said switching system, to create a yet 
further defective bit of yet another significance at a 
time when a number equal to S of said supernumer- 
ary bit paths are in use carrying defective bits, 
reducing the number of said error coding bits to
zero, thereby freeing at least one of said supernu- 
merary bit paths, and routing said yet further de- 
fective bit to said one of said supernumerary paths 
freed by said reduction in the number of error 
coding bits to zero, to thereby form a yet further 
rerouted bit; 

at a receiving station of said communication system, 
routing said yet further rerouted bit from said one 
of said supernumerary bit paths freed by reduction 
of said error coding to zero to a bit position corre- 
sponding to said yet further significance. 

7. A system according to claim 1, wherein at least
some of said stations comprise banks of memory.